ABSTRACT

Among the earliest successful applications of multi-layered neural networks are combinatorial
optimization problems, most notably the travelling salesman problem. Hopfield-type
thermodynamic networks composed of functionally homogeneous visible units have been applied
to a variety of structurally simple NP-hard optimization problems. A fundamental obstacle
to the application of neural networks to difficult problems is that these problems must first
be reduced to 0-1 Hamiltonian optimization problems. We show that certain optimization
problems cannot be embedded in networks composed entirely of visible units and present a
method for defining necessary hidden units together with their best features. We derive a
knapsack-packing network of O(n) units with both standard and conjunctive synapses.
Encouraging simulation results are cited.
Introduction

Since their resurgence in the early 1980s, artificial neural
systems have found applications in computer vision, speech
generation and recognition, robotics, and numerous other areas.
Progress toward the application of neural networks to NP-hard
combinatorial optimization problems has been modest and has been
generally restricted to structurally simple problems. The first
and best-known application was presented by Hopfield and
Tank [Hop85], who reduced the travelling salesman optimization
problem (TSP) to a 0-1 quadratic assignment optimization problem
(QAP) with a Hamiltonian objective function. Hopfield and Tank
then showed that a thermodynamic neural network with symmetric
connections and a non-linear sigmoid transfer function could
effectively find good solutions to embedded TSP problems. The
approach to reducing optimization problems that was proposed by
Hopfield and Tank has since been applied to numerous other
combinatorial optimization problems. J. Ramanujam and
P. Sadayappan present reductions of graph partitioning, graph
K-partitioning, minimum vertex cover, maximum independent set,
maximum clique, set partition, and maximum matching to QAPs
[Ram88]. E.D. Dahl presents reductions of map and graph coloring
problems [Dah88].

A common characteristic of all these problems is that they
are structurally simple and can easily be reduced to QAPs with
Hamiltonian objective functions. The
resulting neural networks consist of functionally homogeneous
processing units whose activation values are directly mapped to
the solutions of their respective embedded optimization
problems. Since these units participate in the expression of
problem solutions for external interpretation, they can be
viewed as visible units.

The Set Partition Problem

Before we consider a difficult optimization problem, we
review Hopfield and Tank's reduction technique on a simple
problem. The integer set partition optimization problem is given
by

INSTANCE: Finite set $A$ of elements and, for each $a \in A$, a size
$b_a \in \mathbb{Z}^+$.

OBJECTIVE: Give a subset $A' \subseteq A$ such that

$$\left| \sum_{a \in A'} b_a - \sum_{a \in A - A'} b_a \right|$$

is minimized over all subsets of $A$.

The set partition decision problem is known to be NP-complete
[Gar79].

Let $V$ be the variable space of the set partition problem.
$V$ is the set of all subsets of $A$, so that $|V| = 2^{|A|}$. Let $S$ be the
space defined by the volume of an n-dimensional unit hypercube
for some, as of yet unspecified, n. Each point in $V$, or
configuration of problem variables, is associated with a
configuration of the states of n neuron-like units, that is, a
point in $S$. This association is defined by a pair of mappings
$M: V \to S$ and $M^{-1}: S \to V$. We could define $n = 2^{|A|}$ and map each
subset of $A$ to a unique vector in the set of n orthogonal,
unit-length, binary vectors. However, this results in a neural
network whose size scales exponentially with the size of the
problem.

A better approach is to define $n = |A|$ and to map each
subset of $A$ to a unique combination of n binary values. Let us
name the elements of $A$ by $\{a_1, a_2, \ldots, a_n\}$. We define $M$, for
the subset of $S$ consisting of the corners of the n-dimensional
hypercube, by

$$M(A') = (s_1, s_2, \ldots, s_n)$$

where

$$s_i = \begin{cases} 1 & \text{if } a_i \in A' \\ 0 & \text{if } a_i \in A - A' \end{cases}
\qquad A' \subseteq A,\ 1 \le i \le n.$$

In order to extract a useful problem solution from a network
configuration, we must also define $M^{-1}$. This is typically done
via application of a threshold, $\lambda$:

$$M^{-1}(s_1, s_2, \ldots, s_n) = \{\, a_i : s_i > \lambda \,\}, \qquad 0 < \lambda < 1,\ 1 \le i \le n.$$

This completes our selection of a representation, that is, the
definition of n and the mapping of problem variable
configurations to, and from, the set of global network
configurations.
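The representation step can be sketched in a few lines of Python. This is only an illustration of the two mappings, assuming that elements are identified by their indices 0..n-1; the function names `encode_subset` and `decode_state` and the default threshold of 0.5 are hypothetical choices, not part of the original derivation.

```python
def encode_subset(subset, n):
    """M: map a subset of element indices {0..n-1} to a hypercube corner."""
    return [1.0 if i in subset else 0.0 for i in range(n)]


def decode_state(state, lam=0.5):
    """M^-1: map any point of the unit hypercube to a subset via threshold lam."""
    if not 0.0 < lam < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    return {i for i, s in enumerate(state) if s > lam}


subset = {0, 3}
state = encode_subset(subset, n=5)
roundtrip = decode_state(state)
# An analog (interior) configuration is decoded by the same threshold.
analog = decode_state([0.9, 0.2, 0.49, 0.51, 0.0])
```

Because the threshold lies strictly between 0 and 1, decoding any corner produced by the encoder returns the original subset.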

The next step in the derivation process is the selection of
an appropriate energy function $E: S \to \mathbb{R}$. Hopfield and Tank showed
that in order to embed the problem in a network of neuron-like
processing units, the energy function must be expressible as a
0-1 Hamiltonian. Hamiltonian energy functions have the form

$$E = -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} s_i s_j w_{ij} + \sum_{i=1}^{n} s_i \theta_i$$

where n is the number of processing units in the network, $s_i$ is
the activation level of the i-th unit, $w_{ij}$ is the connection
strength between the i-th and j-th units, and $\theta_i$ is the activation
threshold of the i-th unit. All connections are symmetric, i.e.,
$w_{ij} = w_{ji}$ for all i, j. The fundamental obstacles to embedding
arbitrary optimization problems in neural networks are the
discovery of a representation and a Hamiltonian energy function
so that minimal-energy network configurations are mapped to
optimal configurations of problem variables. For the set
partition problem, we choose the energy function


$$E = B \left( \sum_{i=1}^{n} s_i b_i - \sum_{i=1}^{n} (1 - s_i)\, b_i \right)^{2}.$$

It can easily be seen that $\sum_{i=1}^{n} s_i b_i$ is the sum of sizes of
elements in $A'$ and $\sum_{i=1}^{n} (1 - s_i)\, b_i$ is the sum of sizes of elements in
$A - A'$. Without some insight into the forms of valid energy
functions, we must manipulate E algebraically before we can be
certain that it can be expressed as a 0-1 Hamiltonian.


$$
\begin{aligned}
E &= B \left( \sum_{i=1}^{n} s_i b_i - \sum_{i=1}^{n} (1 - s_i)\, b_i \right)^{2} \\
  &= B \left( 2 \sum_{i=1}^{n} s_i b_i - \sum_{i=1}^{n} b_i \right)^{2} \\
  &= B \left( 4 \sum_{i=1}^{n} \sum_{j=1}^{n} s_i s_j b_i b_j
        - 4 \sum_{i=1}^{n} s_i b_i \left( \sum_{j=1}^{n} b_j \right)
        + \left( \sum_{i=1}^{n} b_i \right)^{2} \right)
\end{aligned}
$$

Since $\left( \sum_{i=1}^{n} b_i \right)^{2}$ is constant with respect to $s_i$ we can drop
it from E without affecting the minima. The constant
coefficients (4) can be included in B and are dropped as well.
We are left with
$$
\begin{aligned}
E &= B \sum_{i=1}^{n} \sum_{j=1}^{n} s_i s_j b_i b_j
     - B \sum_{i=1}^{n} s_i b_i \left( \sum_{j=1}^{n} b_j \right) \\
  &= -\frac{1}{2} \sum_{i=1}^{n} \sum_{j=1}^{n} s_i s_j \left( -2 B\, b_i b_j \right)
     + \sum_{i=1}^{n} s_i \left( -B\, b_i \sum_{j=1}^{n} b_j \right)
\end{aligned}
$$

This expression is in the standard form of a 0-1 Hamiltonian
with

$$w_{ij} = -2 B\, b_i b_j \qquad \text{and} \qquad \theta_i = -B\, b_i \sum_{j=1}^{n} b_j .$$
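The algebra above can be checked numerically. The following sketch, with hypothetical function names and an arbitrary example instance, compares the original squared-difference energy with the Hamiltonian built from $w_{ij} = -2Bb_ib_j$ and $\theta_i = -Bb_i\sum_j b_j$. Since the derivation dropped a factor of 4 and the constant $B(\sum_i b_i)^2$, the two are expected to agree once those are restored.

```python
from itertools import product


def direct_energy(s, b, B=1.0):
    """B * (sum of sizes in A' minus sum of sizes in A - A')^2."""
    in_sum = sum(si * bi for si, bi in zip(s, b))
    out_sum = sum((1 - si) * bi for si, bi in zip(s, b))
    return B * (in_sum - out_sum) ** 2


def hamiltonian_energy(s, b, B=1.0):
    """-1/2 sum_ij s_i s_j w_ij + sum_i s_i theta_i with the derived weights."""
    n = len(b)
    total = sum(b)
    w = [[-2.0 * B * b[i] * b[j] for j in range(n)] for i in range(n)]
    theta = [-B * b[i] * total for i in range(n)]
    pair = sum(s[i] * s[j] * w[i][j] for i in range(n) for j in range(n))
    return -0.5 * pair + sum(s[i] * theta[i] for i in range(n))


def check_equivalence(b, B=1.0):
    """Verify direct = 4 * hamiltonian + B * (sum b)^2 on every corner."""
    total = sum(b)
    for s in product((0, 1), repeat=len(b)):
        lhs = direct_energy(s, b, B)
        rhs = 4.0 * hamiltonian_energy(s, b, B) + B * total ** 2
        if abs(lhs - rhs) > 1e-9:
            return False
    return True


sizes = [3, 1, 4, 2]
equivalent = check_equivalence(sizes)
balanced = direct_energy((1, 0, 0, 1), sizes)  # {3, 2} versus {1, 4}
```

Because the two energies differ only by a positive scale factor and an additive constant, they share the same minimizing configurations.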


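The derivative of E allows us to deduce the function of unit i:

$$
\frac{dE}{ds_i} = -\sum_{j=1}^{n} s_j w_{ij} + \theta_i
               = 2 B\, b_i \left( \sum_{j=1}^{n} s_j b_j - \frac{1}{2} \sum_{j=1}^{n} b_j \right).
$$

In order to minimize E, the value of $s_i$ should increase whenever
$dE/ds_i < 0$. This corresponds to the condition

$$\sum_{j=1}^{n} s_j b_j < \frac{1}{2} \sum_{j=1}^{n} b_j .$$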


Each unit behaves as a feature detector, detecting the condition
in which the sum of the sizes of elements in $A'$ is less than
half of the total sum of sizes. By detecting this feature, unit
i increases its level of activation and gradually moves its
associated element $a_i$ from $A - A'$ into $A'$. The result is a
decrease in the discrepancy between the sums of sizes of elements
in the two sets.

The local function of unit i ($1 \le i \le n$) in a Hopfield-type
network is given by

    repeat
        ΔE ← −Σ_{j=1..n} s_j w_ij + θ_i ;
        s_i ← τ·s_i + (1−τ) · 1 / (1 + e^(ΔE/T)) ;
    until (externally terminated);


where $\tau$ ($0 < \tau < 1$) controls the response time of the unit. As
$\tau \downarrow 0$, the trajectory of the network configuration becomes
increasingly smooth. Units can be updated either synchronously,
as prescribed by Hopfield and Tank, or asynchronously. After
relaxation, the approximate solution to the embedded set
partition problem is extracted from the configuration of
activation values by application of $M^{-1}$.
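A single unit update for the set partition network can be sketched as follows. This is a minimal illustration under stated assumptions: the gradient is taken as $2Bb_i(\sum_j s_jb_j - \frac{1}{2}\sum_j b_j)$, the new activation is a mix of the old value (weight `tau`) and a logistic function of the gradient, and asynchronous sweeps are used. The helper `logistic_neg`, the parameter defaults, and the sweep count are hypothetical.

```python
import math


def logistic_neg(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def update_unit(i, s, b, B=1.0, T=1.0, tau=0.5):
    """Return the new activation of unit i given the current state s."""
    total = sum(b)
    load = sum(sj * bj for sj, bj in zip(s, b))
    d_e = 2.0 * B * b[i] * (load - total / 2.0)  # dE/ds_i
    return tau * s[i] + (1.0 - tau) * logistic_neg(d_e / T)


def relax(b, s, sweeps=50, **params):
    """Asynchronous relaxation: update the units one at a time, repeatedly."""
    s = list(s)
    for _ in range(sweeps):
        for i in range(len(s)):
            s[i] = update_unit(i, s, b, **params)
    return s


sizes = [3, 1, 4, 2]
rising = update_unit(0, [0.0, 0.0, 0.0, 0.0], sizes)   # A' too light
falling = update_unit(0, [1.0, 1.0, 1.0, 1.0], sizes)  # A' too heavy
final = relax(sizes, [0.1, 0.9, 0.2, 0.8])
```

The update raises the activation when the elements currently in $A'$ weigh less than half the total and lowers it otherwise, which is the feature-detector behavior described in the text.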

It should be noted that V is often a discrete space while
S is continuous. The set of minima of a continuous
0-1 Hamiltonian must be a subset of the set of corners of the
n-dimensional unit hypercube. There are few restrictions on M
except those imposed by the definitions of its domain and range.
If there is no prior knowledge about the minima of f, the
objective function of the original problem, then we
must take care to ensure that all points in V are mapped by M to
corners of the hypercube. Otherwise, we cannot be certain that
the minima of f map to minima of E.

Let us summarize the basic techniques that are employed to
find solutions to optimization problems using neural networks.

1  A network representation is selected. n, the
   dimensionality of the network state space
   (alternatively, the number of units in the network
   architecture), is defined. A transformation, $M: V \to S$,
   from V, the space of problem variable configurations,
   to $S = [0,1]^n$, the volume of an n-dimensional unit
   hypercube, is defined.

2  An inverse transformation, $M^{-1}: S \to V$, is defined. Each
   network configuration in the volume of the
   n-dimensional hypercube is mapped by $M^{-1}$ to a
   configuration of the variables of the problem space.

3  An energy function, $E: S \to \mathbb{R}$, is defined on S so that
   $M^{-1}(s_{min})$ is a minimum of the problem objective, f,
   whenever $s_{min}$ is a minimum of E, $s_{min} \in S$. In
   order to utilize processing units with neuron-like
   behavior, E must take the form of a 0-1 Hamiltonian.

4  The network is relaxed using one of a variety of
   update rules. Prior to relaxation, all network
   parameters (e.g. B in the set partition problem) and
   operational parameters (e.g. $\tau$) are set.

5  After relaxation, $M^{-1}$ is applied to the final network
   configuration, yielding a minimum of f, the objective
   function of the original optimization problem.

Steps 1-3 are, without question, the most difficult and require
a degree of cleverness to carry out. The technique is summarized
in figure 1.
Problem instance I.
    -> (M; existence is not guaranteed)
Binary representation & Hamiltonian energy function E
    -> (continuous partial derivative of E)
Network specification including connection weights and visible, analog,
"neuron-like" units.
    -> (initialization & relaxation)
Network solution to M(I).
    -> ($M^{-1}$)
Approximate solution of problem I.

Figure 1

Existing Thermodynamic Neural Network
Derivation Procedure
In some sense, we have formalized the reduction of discrete
minimization problems to the minimization of continuous
0-1 Hamiltonians. In subsequent discussions we will
refer to these formal concepts but will not burden
ourselves with strict adherence to formality.

The Knapsack Problem

Let us consider the integer knapsack problem given by

INSTANCE: Finite set $Q = \{1, 2, \ldots, n\}$ of elements, for each
$q \in Q$ a cost $w_q \in \mathbb{Z}^+$ and a profit $p_q \in \mathbb{Z}^+$, and a positive
integer knapsack capacity, K.

OBJECTIVE: Give a subset $Q' \subseteq Q$ such that

$$\sum_{q \in Q'} w_q \le K \qquad \text{and} \qquad \sum_{q \in Q'} p_q$$

is maximized over all subsets.


As with the set partition decision problem, the knapsack
decision problem is known to be NP-complete [Gar79]. The
standard approach for embedding this problem in a neural network
calls for the mapping of problem variable configurations, i.e.
subsets of Q, to configurations of a set of n 0-1 variables
representing the activation values, $s_i$, $1 \le i \le n$, of the
processing units. The simplest representation is

$$s_i = \begin{cases} 1 & \text{if } i \in Q' \\ 0 & \text{otherwise} \end{cases}$$

where $n = |Q|$. We propose a global energy function, $E = E_A + E_B$,
where $E_A$ is a term reflecting the benefit of maximizing $\sum_{q \in Q'} p_q$
and $E_B$ is a term reflecting the penalty for violating
$\sum_{q \in Q'} w_q \le K$.

What features should be detected by unit i ($1 \le i \le n$)? Since
$s_i = 1$ whenever element i is placed in the knapsack, $s_i$ should
attain this value whenever the benefit (with respect to minimizing
E) of placing element i in the knapsack outweighs the penalty. In
other words, $\Delta s_i > 0$ whenever

$$\frac{\Delta E}{\Delta s_i} = \frac{\Delta E_A}{\Delta s_i} + \frac{\Delta E_B}{\Delta s_i} < 0 .$$

If $\Delta E / \Delta s_i$ can be computed as the sum of weighted
activation values of $s_j$ ($j \ne i$), then the function of unit i will
be to detect the condition

C1:  $\dfrac{\Delta E_A}{\Delta s_i} + \dfrac{\Delta E_B}{\Delta s_i} < 0$

When $\Delta s_i = 1$, $\Delta E_A$ should be proportional to $-p_i$ because the
inclusion of element i in the knapsack increases the profit of
the packing by effecting a decrease in E. The energy penalty,
$\Delta E_B$, depends on the states of the other units in the network.
C2:  $K < \sum_{j=1,\, j \ne i}^{n} s_j w_j$

If C2 is true then $\Delta E_B$ should be proportional to $w_i$. If C2 is
"strongly" false then there should be no energy penalty, that
is, $\Delta E_B = 0$. $\Delta E_B / \Delta s_i$ is thus a non-linear function of the sum of
weighted activations $\sum_{j \ne i} s_j w_j$. In order to express this
non-linear relationship, some unit, h, must be defined to detect
condition C2. Recall that the role of unit i is to detect
condition C1. We conclude that at least two different kinds of
feature detectors are necessary. The knapsack problem cannot be
embedded in a network of functionally homogeneous visible units.
Hidden units are necessary.

In the remainder of this paper, we propose a systematic
approach to defining necessary hidden units together with their
best features. Our approach is applicable to many packing
problems including bin packing and multiprocessor scheduling.

Plausible Energy Functions

We have presented an informal argument that hidden units
are required by a neural network in order to pack a knapsack. We
reinforce this argument by examining two invalid candidate
energy functions.

Let

$$E = -A \sum_{i=1}^{n} s_i p_i + B \left( \sum_{i=1}^{n} s_i w_i - K \right) \qquad \text{(E1)}$$

This function is clearly inappropriate. Consider the problem
instance $Q = \{1\}$, $w_1 = K - \epsilon$, and $p_1 = \epsilon$ where $0 < \epsilon \ll 1$. For any
fixed values of A, B, and K, $\epsilon$ can be selected so that the
single element, 1, will not be placed in the knapsack even
though it has positive profit and the packing does not overflow.
When the algebraic manipulations prescribed by Hopfield and Tank
are applied to E1, the result is a network in which unit i
($1 \le i \le n$) detects the condition

$$B w_i - A p_i < 0 .$$

In this particular network there are no connections. Each unit
independently becomes active whenever the weighted profit of its
element exceeds the weighted cost, regardless of the current
contents of the knapsack.
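The failure of E1 can be made concrete with a short sketch. It assumes the activation condition $Bw_i - Ap_i < 0$ stated above; the helper names are hypothetical, and the particular choice $\epsilon = \frac{1}{2} BK/(A+B)$ is just one convenient value below the bound $BK/(A+B)$ at which the unit stops activating.

```python
def e1_unit_activates(w_i, p_i, A, B):
    """Condition detected by unit i under E1: B*w_i - A*p_i < 0."""
    return B * w_i - A * p_i < 0


def e1_counterexample(K, A, B):
    """Return (w_1, p_1) for a feasible, profitable element that E1 rejects."""
    eps = 0.5 * B * K / (A + B)
    return K - eps, eps


results = []
for A, B in [(1.0, 1.0), (100.0, 1.0), (1.0, 100.0)]:
    w1, p1 = e1_counterexample(10.0, A, B)
    # The element fits (w1 < K) and has positive profit, yet is not packed.
    results.append((w1 < 10.0, p1 > 0.0, e1_unit_activates(w1, p1, A, B)))
```

However A and B are tuned, a sufficiently small $\epsilon$ leaves the single feasible element out of the knapsack.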

For our second example, let

$$E = -A \sum_{i=1}^{n} s_i p_i + B \left( \sum_{i=1}^{n} s_i w_i - K \right)^{2} \qquad \text{(E2)}$$

This function would at first glance seem sufficient. It
penalizes knapsack overflow and does not reward packings that
minimize $\sum_{i=1}^{n} s_i w_i$. It fails in that the energy penalty, $\Delta E_B$,
associated with an unused knapsack capacity of size $\kappa$ is the same as the
penalty for an overflow of size $\kappa$. It is theoretically possible
to achieve a configuration, that is, a binary assignment of $s_j$
for all $1 \le j \le n$ where $j \ne i$, in which

$$K - \sum_{j \ne i} s_j w_j = \kappa \qquad \text{and} \qquad w_i = 2\kappa$$

for $\kappa > 0$. The penalty, $\Delta E_B$, resulting from the inclusion of
element i in the knapsack is the same as the penalty resulting
from the exclusion of element i. However, since $p_i > 0$ we have
$\Delta E < 0$. In this configuration, the network will always include
element i even though the knapsack capacity is exceeded. Varying
A and B cannot alleviate this invalid bias since the problem
lies in the symmetry of the quadratic term $E_B$.
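The symmetry argument can be illustrated numerically. The sketch below evaluates E2 on a hypothetical two-element instance in which the first element leaves a residual capacity of 2 and the second element has cost 4, so including it turns an unused capacity of 2 into an overflow of 2. The instance and the function name are illustrative assumptions.

```python
def e2_energy(s, w, p, K, A, B):
    """E2 = -A * profit + B * (load - K)^2 for a binary configuration s."""
    profit = sum(si * pi for si, pi in zip(s, p))
    load = sum(si * wi for si, wi in zip(s, w))
    return -A * profit + B * (load - K) ** 2


K = 10
w = [8, 4]  # residual capacity after element 0 is 2; element 1 costs 2 * 2
p = [5, 1]

deltas = []
for A, B in [(1, 1), (1, 1000), (3, 7)]:
    excluded = e2_energy([1, 0], w, p, K, A, B)
    included = e2_energy([1, 1], w, p, K, A, B)  # load 12 exceeds K = 10
    deltas.append((A, included - excluded))
```

For every pair (A, B) the energy change on inclusion equals $-Ap_i$, so the overflowing configuration is always preferred, regardless of how heavily B weights the penalty.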

A Functional Knapsack-Packing Network

We have examined both linear and quadratic forms for the
overflow penalty, $E_B$, and have shown both to be inadequate
given the proposed mapping of the problem variable space to the
space of activation values. In order to proceed, we acknowledge
that an energy function of the activation values of functionally
homogeneous units cannot have the form of a Hamiltonian and
construct a network, G1, of non-neural, binary-state processing
units with a non-Hamiltonian energy function.
and $A, B > 0$. These terms are intuitively obvious. $E_A$ is
minimized by maximizing $\sum_{q \in Q'} p_q$, the profit of the packing. $E_B$ is
minimized when knapsack overflow is minimized. Any feasible
knapsack packing, $Q'$, for which $\sum_{q \in Q'} w_q \le K$ will not incur any
energy penalty. A simple derivation (see Appendix A) yields
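$$\frac{\Delta E}{\Delta s_i} = \frac{\Delta E_A}{\Delta s_i} + \frac{\Delta E_B}{\Delta s_i}, \qquad
\frac{\Delta E_A}{\Delta s_i} = -A p_i,$$

$$\frac{\Delta E_B}{\Delta s_i} =
\begin{cases}
B w_i & \text{if } q_i > K \\
B (w_i + q_i - K) & \text{if } q_i \le K \text{ and } q_i + w_i > K \\
0 & \text{otherwise}
\end{cases}
\qquad \text{where } q_i = \sum_{j \ne i} s_j w_j .$$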
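The piecewise form of the penalty differential can be cross-checked against a candidate penalty function. The sketch below assumes, purely for illustration, that the overflow penalty is $E_B = B \cdot \max(0, \sum_i s_i w_i - K)$. This form is an assumption chosen because it charges nothing for feasible packings and its discrete differential gives $Bw_i$ when the knapsack already overflows and $B(w_i + q_i - K)$ when element i causes the overflow; the function names are hypothetical.

```python
def overflow_penalty(load, K, B):
    """Assumed penalty: proportional to the amount of overflow, else zero."""
    return B * max(0, load - K)


def delta_penalty_piecewise(q_i, w_i, K, B):
    """Change in penalty when element i (cost w_i) joins a load of q_i."""
    if q_i > K:
        return B * w_i
    if q_i + w_i > K:
        return B * (w_i + q_i - K)
    return 0


def delta_penalty_direct(q_i, w_i, K, B):
    """Same change computed directly from the assumed penalty function."""
    return overflow_penalty(q_i + w_i, K, B) - overflow_penalty(q_i, K, B)


mismatches = [
    (q, w)
    for q in range(0, 15)
    for w in range(1, 8)
    if delta_penalty_piecewise(q, w, 10, 3) != delta_penalty_direct(q, w, 10, 3)
]
```

An empty `mismatches` list indicates that the three cases are exactly the discrete differential of the assumed penalty over the tested range.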


From the discrete differential of E, the function of each unit
in network G1 can be determined. Rather than adopting a strict
threshold rule of the form

    if ΔE/Δs_i > 0 then
        s_i ← 0
    else
        s_i ← 1

we employ stochastic smoothing of the type used in
a Boltzmann machine [Hin84]:
    if U(0,1) < 1 / (1 + e^(−(ΔE/Δs_i)/T)) then
        s_i ← 0
    else
        s_i ← 1

where U(0,1) is a continuous uniform random number. [...]
applying Boltzmann Machines to TSP problems. For notational
convenience, we will use $\Delta E$ to mean $\Delta E / \Delta s_i$ in discussions where
there is no ambiguity.

Our next step is to transform G1 into a mean-field model
like those of Hopfield and Tank. Let G2 be a network of analog
processing units that is isomorphic to G1, our network of
discrete units. Each unit i of G2 models the expected value of
the activation value of the corresponding unit in G1. Thus in
network G2, unit i will be updated according to
    q_i ← Σ_{j≠i} s_j w_j ;
    ΔE ← −A p_i ;
    if q_i > K then
        ΔE ← ΔE + B w_i
    else if q_i + w_i > K then
        ΔE ← ΔE + B (w_i + q_i − K) ;
    s_i ← 1 / (1 + e^(ΔE/T))

Unit Function

Analog Network G2, unit i (1 ≤ i ≤ n)
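The unit function of G2 can be sketched in Python as follows. This is an illustrative reading rather than a definitive implementation: it assumes $\Delta E$ is the energy change caused by raising $s_i$, so the activation is taken as $1/(1 + e^{\Delta E/T})$, and the values A = 1 and B = 20 are hypothetical. The instance K = 10, w = (3, 3, 11), p = (1, 1, 10) is the one used as an example in the text.

```python
import math


def logistic_neg(x):
    """Numerically stable 1 / (1 + exp(x))."""
    if x >= 0:
        e = math.exp(-x)
        return e / (1.0 + e)
    return 1.0 / (1.0 + math.exp(x))


def g2_delta_e(i, s, w, p, K, A, B):
    """Energy differential seen by unit i of network G2."""
    q_i = sum(s[j] * w[j] for j in range(len(s)) if j != i)
    d_e = -A * p[i]
    if q_i > K:                      # knapsack already overflows without i
        d_e += B * w[i]
    elif q_i + w[i] > K:             # element i itself causes the overflow
        d_e += B * (w[i] + q_i - K)
    return d_e


def g2_update(i, s, w, p, K, A, B, T):
    """Mean-field activation of unit i at temperature T."""
    return logistic_neg(g2_delta_e(i, s, w, p, K, A, B) / T)


K, w, p = 10, (3, 3, 11), (1, 1, 10)
A, B = 1, 20  # hypothetical parameter values
oversize = g2_delta_e(2, [0, 0, 0], w, p, K, A, B)  # element 3 never fits
fits = g2_delta_e(0, [0, 1, 0], w, p, K, A, B)      # element 1 fits
blocked = g2_delta_e(0, [0, 0, 1], w, p, K, A, B)   # knapsack overflowing
activation = g2_update(0, [0, 1, 0], w, p, K, A, B, T=1.0)
```

A negative differential drives the activation above one half, a positive one drives it below, and the temperature T controls how sharply.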


Depending on the specific network implementation, G2 may or
may not have the capacity to avoid local minima of its energy
function. We postpone a discussion of local minima avoidance to
a later section. Network G2 utilizes asymmetric connections. All
connections leaving unit i have weights $w_i$ and every unit has
connections to, and from, every other unit. Given a digital
multi-processor or array processor, we can pack a
knapsack by simulating network G2 with one processor allocated
to each unit. It is also possible to design special analog
circuitry to evaluate $\Delta E$ as a function of $\sum_{j \ne i} s_j w_j$ in which each
activation value, $s_j$, is modeled by a continuous voltage on
[0, 1]. Unfortunately, such a circuit would be decidedly
non-neural. Since programmable neural network hardware will
[...]
be to derive networks of units that model neuron-like behavior,
i.e., continuous analog integration and thresholding.
The function of unit i in G2 can be rewritten as follows:

    q_i ← Σ_{j≠i} s_j w_j ;
    ΔE ← −A p_i ;
    if (q_i > K) then
        ΔE ← ΔE + B w_i ;
    if (q_i ≤ K) and (q_i + w_i > K) then
        ΔE ← ΔE + B (w_i + q_i − K) ;
    s_i ← 1 / (1 + e^(ΔE/T))

Unit Function

Modified Analog Network G2, unit i (1 ≤ i ≤ n)

Let us rewrite this function again by replacing implicit
binary threshold functions with explicit functions. Let

$$\mathrm{BTHRESH}(x) = \begin{cases} 1 & \text{if } x > 0 \\ 0 & \text{otherwise} \end{cases}$$
The function of unit i in G2 ($1 \le i \le n$) becomes

    q_i ← Σ_{j≠i} s_j w_j ;
    c1_i ← BTHRESH( q_i − K ) ;
    c2_i ← BTHRESH( K − q_i + ε ) ;
    c3_i ← BTHRESH( q_i + w_i − K ) ;
    c23_i ← BTHRESH( D c2_i + D c3_i − [...] ) ;
    ΔE ← −A p_i + c1_i · B w_i + c23_i · B (w_i + q_i − K) ;
    s_i ← 1 / (1 + e^(ΔE/T))

Unit Function

Modified Analog Network G2, unit i (1 ≤ i ≤ n)

Explicit representation of Binary-Threshold Functions

where $D > 0$ is a constant network parameter (similar to A or B)
and $0 < \epsilon \ll 1$ is utilized to detect the condition "$q_i = K$". This
function deserves a brief explanation. Since $c1_i = 1$ whenever
$q_i - K > 0$ and $c1_i = 0$ whenever $q_i - K \le 0$, by adding $c1_i \cdot B w_i$ to $\Delta E$ we
preserve the function of the statement

    if (q_i > K) then
        ΔE ← ΔE + B w_i ;

Since $c23_i = 1$ if, and only if, $c2_i = 1$ and $c3_i = 1$, it is clear that
$c23_i = 1$ whenever ($q_i \le K$) and ($q_i + w_i > K$), and $c23_i = 0$ otherwise.
Multiplication of $B (w_i + q_i - K)$ by $c23_i$ prior to summation with
$\Delta E$ suffices to increment $\Delta E$ by $B (w_i + q_i - K)$ whenever ($q_i \le K$) and
($q_i + w_i > K$). This change preserves the function of the statement

    if (q_i ≤ K) and (q_i + w_i > K) then
        ΔE ← ΔE + B (w_i + q_i − K) ;

The energy function associated with network G2 does not
have a continuous derivative. As $q_i \uparrow K$, unit i strives to
increase its level of activation since this produces a decrease
in E as dictated by the gradient $\Delta E / \Delta s_i = -A p_i$. At the moment $q_i > K$,
the behavior of unit i suddenly changes because the gradient
becomes positive. This sudden change in behavior can result
in a [...] packing. For example, consider the problem
instance

    K = 10,  Q = { 1, 2, 3 },  n = 3
    w = ( 3, 3, 11 )
    p = ( 1, 1, 10 )

Since the profit of element 3 is relatively large, unit 3 will
dominate the activity of the network at moderate temperatures.
As the network is cooled and $s_3 \to 1$, the condition $q_i > K$ will at
[...] point in the cooling schedule and the system may [...]
sufficient time to settle into a configuration [...].
[...] longer, more gradual cooling [...]
the length of the annealing schedule is the standard [...]
compensating for coarseness of the energy landscape [...].


It is in our interest to smooth the energy landscape
defined by E3 so that, in the case of our example, the system is
affected by the "oversize-ness" of element 3 early in
relaxation, and the n-dimensional trajectory of the system
configuration is smooth. In order to smooth the energy landscape
of E3 we utilize the same technique that yielded model G2 from
model G1: stochastic smoothing followed by the mean-field
transformation. We replace the function BTHRESH(x) with the
function

$$\mathrm{CTHRESH}(x) = \frac{1}{1 + e^{(-x/T)}} .$$


The effect of this replacement is illustrated in figure 3. The
result is an efficient, smoothed-energy, analog network which we
name G3. In the low-temperature limit the functions of networks
G2 and G3 coincide since

$$\lim_{T \downarrow 0} \mathrm{CTHRESH}(x) =
\begin{cases} 0 & \text{if } x < 0 \\ 1 & \text{if } x > 0 \end{cases}
\; = \mathrm{BTHRESH}(x) \qquad (x \ne 0).$$

When the condition ($x = 0$) must be detected, we add $\epsilon$ to x prior
to the application of CTHRESH ($0 < \epsilon \ll 1$).
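The relationship between the two threshold functions can be sketched in Python. The implementation below evaluates the logistic form of CTHRESH in a numerically stable way; the function names follow the text, while the specific temperatures and the value of $\epsilon$ used in the examples are hypothetical.

```python
import math


def bthresh(x):
    """Binary threshold: 1 if x > 0, otherwise 0."""
    return 1.0 if x > 0 else 0.0


def cthresh(x, T):
    """Continuous threshold 1 / (1 + exp(-x / T)), evaluated stably."""
    z = x / T
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


midpoint = cthresh(0.0, 1.0)      # exactly one half at x = 0
cold_pos = cthresh(1.0, 0.01)     # approaches bthresh(1.0)
cold_neg = cthresh(-1.0, 0.01)    # approaches bthresh(-1.0)
# Detecting x = 0: shift by a small eps before thresholding.
eps = 1e-3
at_zero = cthresh(0.0 + eps, 1e-5)
```

At x = 0 the sigmoid stays at one half for every temperature, which is why the small offset $\epsilon$ is needed when equality must count as a detection.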
    q_i ← Σ_{j≠i} s_j w_j ;
    c1_i ← 1 / (1 + e^(−(q_i − K)/T)) ;
    c2_i ← 1 / (1 + e^(−(K − q_i + ε)/T)) ;
    c3_i ← 1 / (1 + e^(−(q_i + w_i − K)/T)) ;
    c23_i ← 1 / (1 + e^(−(D c2_i + D c3_i − [...])/T)) ;
    ΔE ← −A p_i + c1_i · B w_i + c23_i · B (w_i + q_i − K) ;
    s_i ← 1 / (1 + e^(ΔE/T))

Unit Function

Smoothed-Energy Analog Network G3, unit i (1 ≤ i ≤ n)
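The low-temperature agreement between the smoothed unit and the branching unit can be checked with a short sketch. Several details here are assumptions made only for illustration: the conjunction is computed as CTHRESH(D·c2 + D·c3 − 1.5·D), where the offset 1.5·D is a hypothetical choice; $\epsilon$ = 0.5 is used because the example weights are integers; and A = 1, B = 20, D = 1, T = 0.01 are arbitrary parameter values.

```python
import math


def cthresh(x, T):
    """Continuous threshold 1 / (1 + exp(-x / T)), evaluated stably."""
    z = x / T
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def load_without(i, s, w):
    """q_i: weighted activation of every unit except unit i."""
    return sum(s[j] * w[j] for j in range(len(s)) if j != i)


def branching_delta_e(i, s, w, p, K, A, B):
    """Reference differential using explicit if-statements."""
    q_i = load_without(i, s, w)
    d_e = -A * p[i]
    if q_i > K:
        d_e += B * w[i]
    elif q_i + w[i] > K:
        d_e += B * (w[i] + q_i - K)
    return d_e


def smoothed_delta_e(i, s, w, p, K, A, B, D, T, eps=0.5, offset=1.5):
    """Differential with every threshold replaced by CTHRESH."""
    q_i = load_without(i, s, w)
    c1 = cthresh(q_i - K, T)
    c2 = cthresh(K - q_i + eps, T)
    c3 = cthresh(q_i + w[i] - K, T)
    c23 = cthresh(D * c2 + D * c3 - offset * D, T)  # assumed conjunction form
    return -A * p[i] + c1 * B * w[i] + c23 * B * (w[i] + q_i - K)


K, w, p = 10, (3, 3, 11), (1, 1, 10)
cases = [(2, [0, 0, 0]), (0, [0, 1, 0]), (0, [0, 0, 1])]
gaps = [
    abs(smoothed_delta_e(i, s, w, p, K, 1, 20, D=1.0, T=0.01)
        - branching_delta_e(i, s, w, p, K, 1, 20))
    for i, s in cases
]
```

At higher temperatures the smoothed differential departs from the branching one, which is the intended effect: the penalty for an oversize element is felt gradually rather than abruptly.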
We are now in a position to make an interesting
observation. Since the values $c1_i$, $c2_i$, $c3_i$, and $c23_i$ are
computed continuously, if we prevent these values from changing
too rapidly, then $c1_i$, $c2_i$, $c3_i$, and $c23_i$ can be computed
in parallel. We simply replace the entire unit with a network of
5 simpler units as shown in figure 3. We refer to these units as
c1-, c2-, c3-, c23-, and s-units. The functions of these units,
described in algorithmic form, are shown in figure 4. We refer to
the resulting network as G4.

The astute reader will notice that we have introduced
neuron-like behavior to each unit. We can now interpret the
functions of all units as feature detectors as shown in figure
5. The only anomaly of network G4 is in the function of the
s-unit, which must compute a weighted product of the incoming
activation signals from other s-units and its c23-unit. This
special case is represented by a higher-order interaction or
conjunctive synapse. The activation value of the c23-unit must
moderate the transmission of the activation values of all
external s-units to the s-unit in its cluster. It is well known
that the function of conjunctive synapses can be approximated,
in the low-temperature limit, by conjunctive units. Each
conjunctive synapse is simply replaced by a separate unit that
detects the conjunction of the c23-unit and the s-unit as shown
in figure 6. We note that
Figure 3

Analog Knapsack-Packing Network G4
with conjunctive synapses


Hidden Units:

c1    Detects when the knapsack overflows given that element i
      is not in the knapsack.

c2    Detects the absence of overflow in the knapsack given that
      element i is not in the knapsack.

c3    Detects when the knapsack overflows given that element i
      is in the knapsack.

c23   Detects the conjunction of the conditions detected by c2 and
      c3, that is, when element i produces an overflowing
      knapsack.

Visible Units:

s     Detects the condition whereby the benefit of including element
      i in the knapsack exceeds the cost.

Figure 5

Units as Feature Detectors


Conjunctive synapses of the form [diagram not reproduced]
can be replaced, in the low-temperature limit,
by conjunctive units of the form [diagram not reproduced],
where D is a constant used to control the
"decisiveness" of the conjunction. D > 0.

Figure 6

Reduction of Synaptic Order
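$$B\, c23_i\, q_i = B\, c23_i \sum_{j=1,\, j \ne i}^{n} s_j w_j = B \sum_{j=1,\, j \ne i}^{n} c23_i\, s_j w_j$$

and that

$$\lim_{T \downarrow 0} a_{ij} = \lim_{T \downarrow 0} c23_i\, s_j
\qquad \text{so that} \qquad
\lim_{T \downarrow 0} B\, c23_i\, q_i = \lim_{T \downarrow 0} B \sum_{j \ne i} a_{ij}\, w_j .$$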
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1  A  network  representation  is  selected,  n,  the 
dimensionality  of  the  network  state  space 
(alternatively,  the  number  of  units  in  the  network 
architect ure)  is  defined.  A  transformation,  ,V/:V— >S' , 

from  V,  the  space  of  problem  variable  configurations, 
to  S  '  =  { 0 ,  1 } 71 ,  the  corners  of  the  n-dimensior.al  unit 
hypercube,  is  defined. 

2  An  inverse  transformation,  :  S— »V,  is  defined.  Each 

network  configuration  in  the  volume  of  the 
^-dimensional  hypercube  is  mapped  by  M~'-  to  a 
configuration  of  the  variables  of  the  problem  space. 

3  A  discrete  energy  function,  £:S'— is  defined  on 
S'  so  that  M~'l(  smin)  is  a  minimum  of  the  problem 
objective,  f,  whenever  smin,  is  a  minimum  of  E, 

Smin€ S .  E  need  not  take  the  form  of  a 

0-1  Hamiltonian. 


36 


Figure  7 

3-Element  Analog  Knapsack-PacJcing  Network 
4  [...] giving the gradient of E at the corners of the
hypercube.

5  A network of non-neural, binary-state processing units
is constructed (e.g. G1). Stochastic smoothing is
used to avoid local minima.

6  The mean-field transformation is applied to the
[...] non-neural, analog processing units. This
transformation provides an interpolation of E in the
interior of the hypercube, S.

7  Implicit binary-threshold functions are identified and
rewritten explicitly via the BTHRESH function.
Auxiliary binary variables (e.g. z22) are assigned to
represent boolean combinations (and, or, etc.) of
simple conditions.

8  The BTHRESH function is replaced with the sigmoid
CTHRESH function. This smooths the energy landscape.

9  The entire non-neural analog unit is replaced with a
collection of hidden units together with a single
visible unit. A hidden unit is introduced for each
condition, that is, each application of CTHRESH.

10  The network is relaxed using the [...] update rules.
[...]
The function E need not take the form of a Hamiltonian, nor need
it even be continuously differentiable on the interior of the
hypercube. Consequently, we will not have neuron-like unit
behavior, i.e., summation and thresholding. Neuron-like behavior
must be re-introduced by steps 5-9. More importantly, explicit
identification of implicit binary thresholds that are present
[...] allows us to eventually replace [...] units. The
constructive method, that is, the sequence of function
replacements, simplifies the conceptual "leap" that must be
taken when reducing non-trivial discrete optimization problems
to neural-network algorithms. The method is summarized in
figure 8.
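
The replacement of BTHRESH by CTHRESH can be illustrated with a
minimal sketch. The definitions below are assumptions made for
illustration only (a unit step for BTHRESH and a logistic sigmoid
with temperature T for CTHRESH); the helper max_gap and the sample
points are hypothetical. The sketch suggests how the smooth
function approaches the binary one as T decreases.

```python
import math

def bthresh(x):
    """Assumed binary threshold: 1 if x > 0, else 0."""
    return 1.0 if x > 0 else 0.0

def cthresh(x, T):
    """Assumed continuous threshold: logistic sigmoid with temperature T."""
    z = x / T
    # Evaluate the logistic function in a numerically stable way.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def max_gap(T, xs):
    """Largest difference between the smooth and binary thresholds on xs."""
    return max(abs(cthresh(x, T) - bthresh(x)) for x in xs)

# Hypothetical sample points away from the threshold itself.
xs = [-2.0, -0.5, 0.5, 2.0]
gaps = [max_gap(T, xs) for T in (1.0, 0.1, 0.01)]
```

Under these assumptions the gap shrinks as T is lowered, which is
the sense in which a smoothed network can be functionally correct
in the high-gain limit.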
Problem instance I.
   |
   |  M
   v
Binary representation, arbitrary discrete energy function E.
   |
   |  Discrete differential of E
   v
Network specification including connection weights and
binary-state, "non-neuron-like" units.
   |
   |  Mean-field transformation
   v
Simple, inefficient network of analog, "non-neuron-like" units.
Functionally correct in the high-gain limit.
   |
   |  Energy smoothing by replacement of embedded
   |  binary-threshold functions
   v
Efficient network of analog, "non-neuron-like" units.
   |
   |  Parallel decomposition
   v
Network of visible & hidden, analog, "neuron-like" units.
   |
   |  Initialization & relaxation
   v
Network solution to M(I).
   |
   |  M^-1
   v
Approximate solution of problem I.


Figure  8 

Proposed  Thermodynamic  Neural  Network 
Derivation  Procedure 
Global Timing Considerations


A number of issues must be considered when implementing
neural networks. We must specify network and operational
parameters and an annealing schedule. In continuous-time systems
the response curves of the components comprising the connections
and units must be specified. In our implementations, time is
discrete so that we must specify the scheduling of unit updates.
In this section we address [...] timing.


It is well known that Hopfield and Tank's models prescribe
deterministic trajectories through configuration space during
network relaxation. As such, they are subject to entrapment in
local minima of E. Hopfield and Tank modeled the non-linear
response of neuron-like units with non-linear amplifiers. When
the gain, λ (note: λ = 1/T), was increased during relaxation,
deeper energy minima were located. In a noise-free simulation,
varying the gain is inconsequential and the final configuration
of the network depends solely on the initial configuration and
its associated basin of attraction. During simulation, numerical
noise may introduce counter-gradient transitions and could
account for Hopfield and Tank's slightly improved results.
Unfortunately, numerical noise is not easily characterized.

Boltzmann  machines  can  exit  local  minima  because  they 
possess  a  well-defined  generation  mechanism  to  introduce 
counter-gradient transitions or "uphill jumps" in E.
Unfortunately, the variance in the binary activation states of
Boltzmann units is relatively large and results in unacceptably
long relaxation time requirements. In addition to the
non-deterministic state transitions of Boltzmann units, another
mechanism for generating counter-gradient transitions is
present: asynchronism. Randomly probing units for update is
equivalent to simulating random propagation delays along
connections. In the Boltzmann machine, the mean propagation
delay is n update periods where n is the number of free-running
units (unclamped units in the terminology of Ackley, Hinton, and
Sejnowski [Hin84]).

In our implementation, we combine the continuous state-space
of Hopfield networks with the asynchronism of the Boltzmann
machine. This is accomplished by using deterministic unit update
rules (those of networks G2 and G3) and by randomly scheduling
each unit for update based on a discrete uniform(1,n)
distribution. At update, each unit's next state is computed and
immediately adopted. This combination yields rapid relaxation
with a well-defined, controllable source of counter-gradient
transitions. Random propagation delays induce Gaussian internal
noise into the activation value of each unit. The variance of
the noise is a function of the distribution of propagation
delays as well as the entropy of the system.
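
The scheduling scheme just described can be sketched as follows.
This is a minimal illustration, not the update rules of G2 or G3:
the weight matrix W, the biases b, the logistic sigmoid, and the
temperature T are all hypothetical stand-ins. Only the scheduling
(one unit drawn uniformly at random per update period, with its
new state adopted immediately) follows the text.

```python
import math
import random

def sigmoid(x, T):
    """Assumed logistic transfer function with temperature T."""
    z = max(-60.0, min(60.0, x / T))  # clamp to avoid overflow in exp
    return 1.0 / (1.0 + math.exp(-z))

def relax_async(W, b, s, T, sweeps, rng):
    """Asynchronously relax continuous-state units.

    On every update period one unit is drawn from a discrete
    uniform distribution over the n units; its next state is
    computed deterministically and adopted immediately.
    """
    n = len(s)
    for _ in range(sweeps * n):
        i = rng.randrange(n)  # 0-based stand-in for uniform(1, n)
        net = b[i] + sum(W[i][j] * s[j] for j in range(n) if j != i)
        s[i] = sigmoid(net, T)
    return s

# Hypothetical 3-unit network with symmetric, mutually inhibitory weights.
W = [[0.0, -2.0, -2.0],
     [-2.0, 0.0, -2.0],
     [-2.0, -2.0, 0.0]]
b = [1.0, 1.0, 1.0]
rng = random.Random(0)
s0 = [rng.uniform(0.4, 0.6) for _ in range(3)]
s = relax_async(W, b, list(s0), T=0.1, sweeps=50, rng=rng)
```

In this toy network one would expect a single unit to settle near
1 and the others near 0; which unit wins depends on the initial
state and on the random update order.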
There are several interesting interactions between internal
noise and the annealing process. We have stated that internal
noise is partially dependent on the system entropy. As the
network stabilizes at a minimum, the motion of its global state
through n-dimensional space decreases and activation levels of
units are modified by successively smaller amounts. As a result,
the effect of propagation delays becomes less pronounced. For
example, let us assume that the signal, s_i, from unit i to unit
j is delayed along connection (i,j). If unit i has stabilized
then s_i will not have changed since its previous value was
received by unit j, so that unit j receives the current value
of s_i. In this case, the signal delay has no effect. As
propagation delays effectively disappear, the variance of
internal noise decreases. Let us briefly examine the dynamics of
a simple annealed system [Kir83]. As T → 0, fewer random
perturbations are accepted and the progress of the global state
toward a minimum degenerates. Although it is theoretically
possible for the system to make a large counter-gradient
transition at low temperature, the probability of such a
transition decreases exponentially. We can improve the
productivity of the annealing algorithm by decreasing the
average "height", ΔE, of proposed counter-gradient transitions
as the system entropy decreases. This is precisely the effect
that is achieved by propagation delay-induced internal noise
since its variance decreases as the network configuration
approaches a local minimum.
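
The exponential decrease can be made concrete with a small
sketch, assuming the simple annealed system uses the standard
Metropolis acceptance rule, in which an uphill move of height ΔE
is accepted with probability exp(-ΔE/T). The function name and the
sample values are illustrative only.

```python
import math

def accept_prob(delta_e, T):
    """Metropolis acceptance probability for a proposed transition."""
    if delta_e <= 0:
        return 1.0  # downhill (or level) moves are always accepted
    return math.exp(-delta_e / T)

# Compare a large and a small uphill move as the temperature falls.
for T in (1.0, 0.5, 0.1):
    print(T, accept_prob(1.0, T), accept_prob(0.1, T))
```

At low T the large move is almost never accepted while the small
one still is with appreciable probability, which is why proposing
smaller ΔE late in the schedule wastes fewer trials.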
The sigmoid transfer function

    g = 1 / (1 + e^[...])

has stable points at 0 and 1. g has a meta-stable point at 0.5.
When s_i > 0.5 and s_i < 0.5, the state of unit i will have
a tendency to move toward 1 and 0 respectively. If s_i is close
to 0.5 (and unit i is "indecisive" or "teetering") the effect of
internal noise on s_i is more pronounced. In a meta-stable state
even a small amount of noise may "tip the scale" and start
the network on a trajectory to a different final configuration.
As the network stabilizes and its configuration reaches a corner
of the hypercube, not only does the variance of internal noise
decline, but the probability that a noise surge will result in
the transition of unit i from s_i < 0.5 to s_i > 0.5 (or the
converse) decreases since the difference between s_i and 0.5
increases as the network settles.
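
The sensitivity near 0.5 can be illustrated with a deliberately
simple sketch. It assumes a logistic sigmoid and a single
hypothetical self-exciting unit whose input is w * (s - 0.5); the
unit, the weight w, and the temperature T are invented for the
illustration and are not one of the networks G1-G5.

```python
import math

def g(x, T):
    """Assumed logistic sigmoid with temperature T."""
    return 1.0 / (1.0 + math.exp(-x / T))

def settle(s, w=1.0, T=0.1, steps=50):
    """Iterate s <- g(w * (s - 0.5)) for a hypothetical self-exciting unit."""
    for _ in range(steps):
        s = g(w * (s - 0.5), T)
    return s

high = settle(0.5 + 1e-6)   # nudged slightly above the meta-stable point
low = settle(0.5 - 1e-6)    # nudged slightly below it
balanced = settle(0.5)      # started exactly on it
```

With these assumed parameters a perturbation of one part in a
million is enough to send the unit toward opposite ends of the
unit interval, while a state that already sits near 0 or 1 would
need a far larger disturbance to cross 0.5.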


Simulation 


A number of simulations were performed on networks G3 and G4.
In all simulations an asynchronous, sequential update rule like
that of the Boltzmann machine was used. Prior to relaxation the
activation values of all units were initialized to
uniform(0.4,0.6) random numbers. Simple fixed-length annealing
Problem instances were randomly generated as a function of
K. Specifically, n ~ U(K-5, K+5), w_i ~ U(1,K), and p_i ~ U(1,K)
for 1 ≤ i ≤ n. Several values of K (10, 20, 35, and 80) were
tested. For each combination of network parameters, and for each
problem instance, 10 simulations were performed. After a network
was constructed for each problem instance, the network was
relaxed using the 10 or 20 step annealing schedule. The resulting
approximate solutions were compared with those produced by the
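
The instance generation described above can be sketched as
follows. Integer-valued draws are an assumption (the text gives
only the ranges), the function and variable names are
hypothetical, and the knapsack capacity is omitted because this
passage does not say how it was chosen.

```python
import random

def random_instance(K, rng):
    """Draw one knapsack instance as a function of K.

    n ~ U(K-5, K+5), w_i ~ U(1, K), p_i ~ U(1, K).
    Integer-valued draws are an assumption.
    """
    n = rng.randint(K - 5, K + 5)          # number of items
    weights = [rng.randint(1, K) for _ in range(n)]
    profits = [rng.randint(1, K) for _ in range(n)]
    return weights, profits

rng = random.Random(1)
instances = {K: random_instance(K, rng) for K in (10, 20, 35, 80)}
```

Each entry of instances holds one weight list and one profit list
of equal length for the corresponding value of K.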
Conclusion


We have seen that certain optimization problems cannot be
embedded in neural networks that utilize simple mappings of
problem variables to n-dimensional spaces of activation values.
For these problems, it is necessary to define hidden units that
detect different types of features. Although we have not
discounted the possibility of discovering hidden units and their
[...] warranted. The method that we have presented is suitable
for objective functions that are not expressible as 0-1
Hamiltonians, or those that are not continuously differentiable
over the interior of the unit hypercube.

A range of computational networks with

1  Complex non-neural, binary-state units (G1),

2  Complex  non-neural,  continuous-state  units  (G2,G3), 

3  Simple,  neuron-like,  continuous-state  units  with 
conjunctive  synapses  (G4),  and 

4  Simple,  neuron-like,  continuous-state  units  with 
first-order  synapses  (G5) 

have  been  presented  for  the  integer  knapsack  packing  problem. 
Preliminary  simulation  results  are  promising,  especially  in 
light  of  the  ease  with  which  we  were  able  to  find  viable  network 
parameters.  Simulation  results  indicate  that  these  networks 
appear to scale reasonably well. In the course of additional
research we have found the method to be applicable to
bin-packing, multiprocessor scheduling, and job-sequencing
problems as well. We hope to simulate networks for these
problems in the near future.
Appendix A

Condition 3

Since v_i > 0 for 1 ≤ i ≤ n, condition 3 is vacuously false.